NMS-19648: CircleCI: Store failed result of flaky tests, improve a test case#8404
NMS-19648: CircleCI: Store failed result of flaky tests, improve a test case#8404mershad-manesh wants to merge 38 commits intodevelopfrom
Conversation
This reverts commit 35548f2.
There was a problem hiding this comment.
Pull request overview
This PR focuses on improving CI reliability and debuggability around flaky tests by preserving failure evidence across retries, refining CircleCI smoke/itest distribution, and stabilizing a few UI component tests.
Changes:
- Update CircleCI smoke/itest scripts to preserve failing JUnit XMLs as “flaky evidence” before retrying, and improve test splitting/logging.
- Introduce a quarantined
smoke-test-flakyCircleCI job/suite and adjust existing smoke job parallelism/config behavior. - Stabilize several Vue/Vitest UI tests (dialog stubbing, cleanup improvements, and debounce handling) and harden Selenium clicking via retry/obscured-element checks.
Reviewed changes
Copilot reviewed 26 out of 26 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| ui/tests/components/EventConfigurationDetail/EventConfigEventTable.test.ts | Adds dialog mocking/cleanup and switches debounce verification to flush() approach. |
| ui/tests/components/EventConfiguration/EventConfigSourceTable.test.ts | Adds stubs for dialogs, improves teardown, and updates debounce tests to use flush(). |
| ui/tests/components/EventConfiguration/Dialog/UploadedFileRenameDialog.test.ts | Mocks FeatherDialog and improves teardown by closing dialog before unmount. |
| ui/tests/components/EventConfiguration/Dialog/CreateEventConfigurationDialog.test.ts | Mocks FeatherDialog and adjusts teardown/visibility handling to reduce flakiness. |
| ui/tests/components/EventConfigEventCreate/BasicInformation.test.ts | Ensures fake timers are always restored via try/finally and adds afterEach timer reset. |
| smoke-test/src/test/java/org/opennms/smoketest/GrafanaEndpointPageIT.java | Uses clickElement(By...) helper to reduce click flakiness. |
| smoke-test/src/main/java/org/opennms/smoketest/ui/framework/Toggle.java | Switches to retrying click helper for toggles. |
| smoke-test/src/main/java/org/opennms/smoketest/ui/framework/Element.java | Adds clickWithRetry with scroll + obscured-element detection. |
| smoke-test/src/main/java/org/opennms/smoketest/ui/framework/Button.java | Switches to retrying click helper for buttons. |
| smoke-test/src/main/java/org/opennms/smoketest/selenium/AbstractOpenNMSSeleniumHelper.java | Enhances clickElement with scroll + obscured-element detection. |
| .circleci/scripts/smoke.sh | Filters ITs by suite category, simplifies docker pre-pull, and preserves failing XMLs before retries. |
| .circleci/scripts/itest.sh | Adds node summary logging, preserves failing XMLs before retries, and increases db create threads. |
| .circleci/scripts/find-tests/git.py | Fails fast with a clearer error when .nightly lacks parent_branch. |
| .circleci/scripts/find-tests/find-tests.py | Fixes shebang and removes noisy debug printing. |
| .circleci/pyscripts/generate_main.py | Reduces workflow max_auto_reruns. |
| .circleci/main/workflows/workflows_v2.json | Adds workflow entry for quarantined flaky smoke tests. |
| .circleci/main/jobs/tests/smoke/smoke-test.index | Registers the new smoke-test-flaky job. |
| .circleci/main/jobs/tests/smoke/smoke-test-sentinel.yml | Increases sentinel smoke parallelism. |
| .circleci/main/jobs/tests/smoke/smoke-test-minion.yml | Increases minion smoke parallelism. |
| .circleci/main/jobs/tests/smoke/smoke-test-flaky.yml | Adds a quarantined flaky smoke job (allow failures, retries). |
| .circleci/main/jobs/tarball-assembly-only.yml | Adjusts resource defaults and Maven heap settings. |
| .circleci/main/commands/oci/trivy-analyze.yml | Removes a redundant/incorrect step from the Trivy analyze command. |
| .circleci/main/commands/generic/generic.yml | Avoids repeated apt installs by checking package presence first. |
| .circleci/main/commands/executions/run-smoke-tests.yml | Adds allow-failures mode and collects/stores flaky-evidence artifacts. |
| .circleci/main/commands/executions/run-integration-tests.yml | Collects/stores flaky-evidence artifacts for integration tests. |
| .circleci/main/commands/executions/run-build.yml | Ensures integration tests are skipped for the build-only compile step. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
smoke-test/src/main/java/org/opennms/smoketest/ui/framework/Element.java
Outdated
Show resolved
Hide resolved
smoke-test/src/main/java/org/opennms/smoketest/selenium/AbstractOpenNMSSeleniumHelper.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 26 out of 26 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| | xargs grep -L 'MinionTests\|SentinelTests\|FlakyTests' \ | ||
| | sed -e 's#^.*src/test/java/\(.*\)\.java#\1#' | tr "/" "." \ | ||
| > failsafe_classnames | ||
| ;; | ||
| minion) | ||
| circleci tests glob '**/src/test/java/**/*IT*.java' \ | ||
| | xargs grep -l 'MinionTests' \ | ||
| | sed -e 's#^.*src/test/java/\(.*\)\.java#\1#' | tr "/" "." \ | ||
| > failsafe_classnames | ||
| ;; | ||
| sentinel) | ||
| circleci tests glob '**/src/test/java/**/*IT*.java' \ | ||
| | xargs grep -l 'SentinelTests' \ | ||
| | sed -e 's#^.*src/test/java/\(.*\)\.java#\1#' | tr "/" "." \ | ||
| > failsafe_classnames | ||
| ;; | ||
| flaky) | ||
| circleci tests glob '**/src/test/java/**/*IT*.java' \ | ||
| | xargs grep -l 'FlakyTests' \ | ||
| | sed -e 's#^.*src/test/java/\(.*\)\.java#\1#' | tr "/" "." \ | ||
| > failsafe_classnames |
There was a problem hiding this comment.
find_tests() pipes circleci tests glob into xargs grep .... When grep -l/-L finds no matches (or when the glob yields no files), grep exits non-zero and, with set -e -o pipefail, this will abort the script even though an empty test list should be valid. Consider guarding this stage (e.g., use xargs -r and/or wrap the grep stage with || true in a subshell) so failsafe_classnames can be empty without failing the job.
| | xargs grep -L 'MinionTests\|SentinelTests\|FlakyTests' \ | |
| | sed -e 's#^.*src/test/java/\(.*\)\.java#\1#' | tr "/" "." \ | |
| > failsafe_classnames | |
| ;; | |
| minion) | |
| circleci tests glob '**/src/test/java/**/*IT*.java' \ | |
| | xargs grep -l 'MinionTests' \ | |
| | sed -e 's#^.*src/test/java/\(.*\)\.java#\1#' | tr "/" "." \ | |
| > failsafe_classnames | |
| ;; | |
| sentinel) | |
| circleci tests glob '**/src/test/java/**/*IT*.java' \ | |
| | xargs grep -l 'SentinelTests' \ | |
| | sed -e 's#^.*src/test/java/\(.*\)\.java#\1#' | tr "/" "." \ | |
| > failsafe_classnames | |
| ;; | |
| flaky) | |
| circleci tests glob '**/src/test/java/**/*IT*.java' \ | |
| | xargs grep -l 'FlakyTests' \ | |
| | sed -e 's#^.*src/test/java/\(.*\)\.java#\1#' | tr "/" "." \ | |
| > failsafe_classnames | |
| | xargs -r grep -L 'MinionTests\|SentinelTests\|FlakyTests' \ | |
| | sed -e 's#^.*src/test/java/\(.*\)\.java#\1#' | tr "/" "." \ | |
| > failsafe_classnames || true | |
| ;; | |
| minion) | |
| circleci tests glob '**/src/test/java/**/*IT*.java' \ | |
| | xargs -r grep -l 'MinionTests' \ | |
| | sed -e 's#^.*src/test/java/\(.*\)\.java#\1#' | tr "/" "." \ | |
| > failsafe_classnames || true | |
| ;; | |
| sentinel) | |
| circleci tests glob '**/src/test/java/**/*IT*.java' \ | |
| | xargs -r grep -l 'SentinelTests' \ | |
| | sed -e 's#^.*src/test/java/\(.*\)\.java#\1#' | tr "/" "." \ | |
| > failsafe_classnames || true | |
| ;; | |
| flaky) | |
| circleci tests glob '**/src/test/java/**/*IT*.java' \ | |
| | xargs -r grep -l 'FlakyTests' \ | |
| | sed -e 's#^.*src/test/java/\(.*\)\.java#\1#' | tr "/" "." \ | |
| > failsafe_classnames || true |
| // Before debounce time | ||
| it('does not call store immediately on input (debounce)', () => { | ||
| wrapper.vm.onChangeSearchTerm('test') | ||
| expect(store.onChangeEventsSearchTerm).not.toHaveBeenCalled() |
There was a problem hiding this comment.
This debounce test calls onChangeSearchTerm('test') but never flushes/cancels the debounced function. With real timers enabled, the pending debounce may fire after the test completes and bleed into subsequent tests (mock call counts/state). Cancel (or flush) the debounced function at the end of this test or in afterEach (e.g., onChangeSearchTerm.cancel()).
| expect(store.onChangeEventsSearchTerm).not.toHaveBeenCalled() | |
| expect(store.onChangeEventsSearchTerm).not.toHaveBeenCalled() | |
| wrapper.vm.onChangeSearchTerm.cancel() |
synqotik
left a comment
There was a problem hiding this comment.
Overall looks good, just one change to make.
| element); | ||
|
|
||
| return result instanceof Boolean && (Boolean)result; | ||
| } |
There was a problem hiding this comment.
Agree with this (Copilot comment), isCenterPointObscured should only appear once. I'm thinking make it a public static function in Element, which would actually look more like the implementation here. The Javascript checks if the element is null so you don't have to do that explicitly in the Java code.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 27 out of 27 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| | xargs -I{} cp {} "${FLAKY_EVIDENCE_DIR}/" | ||
| # Now delete originals so fresh results are written by the retry | ||
| find . \( -path "*/failsafe-reports/TEST-*.xml" -o -path "*/surefire-reports/TEST-*.xml" \) \ | ||
| -exec grep -l -E 'failures="[1-9]|errors="[1-9]' {} + 2>/dev/null \ | ||
| | xargs rm -f |
There was a problem hiding this comment.
The cp/rm pipelines use xargs without -r/--no-run-if-empty, which can still invoke cp/rm with no arguments when no failing XMLs are found (leading to noisy errors, even though set +e prevents failure). Consider adding xargs -r (and optionally -0 with find -print0) to make this robust and quiet.
| | xargs -I{} cp {} "${FLAKY_EVIDENCE_DIR}/" | |
| # Now delete originals so fresh results are written by the retry | |
| find . \( -path "*/failsafe-reports/TEST-*.xml" -o -path "*/surefire-reports/TEST-*.xml" \) \ | |
| -exec grep -l -E 'failures="[1-9]|errors="[1-9]' {} + 2>/dev/null \ | |
| | xargs rm -f | |
| | xargs -r -I{} cp {} "${FLAKY_EVIDENCE_DIR}/" | |
| # Now delete originals so fresh results are written by the retry | |
| find . \( -path "*/failsafe-reports/TEST-*.xml" -o -path "*/surefire-reports/TEST-*.xml" \) \ | |
| -exec grep -l -E 'failures="[1-9]|errors="[1-9]' {} + 2>/dev/null \ | |
| | xargs -r rm -f |
ui/tests/components/EventConfigurationDetail/EventConfigEventTable.test.ts
Outdated
Show resolved
Hide resolved
ui/tests/components/EventConfiguration/EventConfigSourceTable.test.ts
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 27 out of 27 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| final WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(10), Duration.ofMillis(200)); | ||
| wait.ignoring(StaleElementReferenceException.class).until(webDriver -> { | ||
| final WebElement element = webDriver.findElement(by); | ||
| ((JavascriptExecutor)webDriver).executeScript("arguments[0].scrollIntoView({block: 'center', inline: 'center'});", element); | ||
|
|
There was a problem hiding this comment.
clickWithRetry uses webDriver.findElement(by) inside WebDriverWait.until(...) but does not ignore NoSuchElementException. Since implicit wait is set to 0, a briefly-missing element will throw immediately and abort the wait instead of retrying, which can reintroduce flakiness. Consider adding NoSuchElementException to the ignored exceptions (and/or using an ExpectedCondition that returns false/null when the element is absent).
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
All Contributors
External References